SnoMedTagger: A Semantic Tagger for Medical Narratives
Authors
Saman Hina, Eric Atwell, Owen Johnson
Abstract
The identification and classification of semantic information in medical narratives are critical for research applications such as question answering systems and statistical analysis. Our contribution is a novel semantic tagger, SnoMedTagger, which tags complex semantic information (paraphrases of concepts, abbreviations of concepts, and complex multiword concepts) with 16 SNOMED CT semantic categories in medical narratives. SnoMedTagger is developed to support both domain users and non-domain users working on research questions that involve medical narratives. Our method combines corpus-based rule-patterns derived from a real-world dataset with rule-patterns developed by refining the SNOMED CT (Systemised NOmenclature of MEDicine-Clinical Terms) clinical vocabulary. These rule-patterns were able to identify semantic information in a range of texts and classify it with the respective semantic categories derived from SNOMED CT. On an unseen gold standard, our rule-pattern-based semantic tagger outperformed an SVM-based machine learning system and the ontology-based BioPortal web annotator. The study shows that it is possible to identify and classify complete semantic information with SNOMED CT semantic categories in medical narratives with higher accuracy than that achieved by existing approaches.
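The abstract describes tagging by rule-patterns that cover abbreviations, paraphrases, and multiword concepts. The sketch below is only a toy illustration of that general idea and is not the SnoMedTagger rule set: the `RULES` table, the `Tag` type, and the `tag` function are hypothetical names, the four patterns are invented for this example, and the category labels are just sample strings in the style of SNOMED CT semantic categories. Overlaps are resolved by preferring the earliest and then the longest match, which is an assumption of this sketch rather than a detail taken from the paper.

```python
import re
from typing import List, NamedTuple


class Tag(NamedTuple):
    start: int      # character offset where the match begins
    end: int        # character offset just past the match
    text: str       # matched surface string
    category: str   # semantic category assigned by the rule


# Hypothetical rule-patterns (regex, category). Each rule lists a full
# multiword form together with an abbreviation of the same concept.
RULES = [
    (r"\bmyocardial infarction\b|\bMI\b", "disorder"),
    (r"\bshortness of breath\b|\bSOB\b", "finding"),
    (r"\bchest x-?ray\b|\bCXR\b", "procedure"),
    (r"\bleft (?:lower|upper) lobe\b", "body structure"),
]
COMPILED = [(re.compile(p, re.IGNORECASE), c) for p, c in RULES]


def tag(text: str) -> List[Tag]:
    """Return non-overlapping tags, earliest first, longest match preferred."""
    candidates = []
    for pattern, category in COMPILED:
        for m in pattern.finditer(text):
            candidates.append(Tag(m.start(), m.end(), m.group(0), category))
    # Sort by start offset, then by descending length.
    candidates.sort(key=lambda t: (t.start, -(t.end - t.start)))
    result = []
    last_end = 0
    for t in candidates:
        if t.start >= last_end:  # skip anything overlapping a kept match
            result.append(t)
            last_end = t.end
    return result


if __name__ == "__main__":
    note = "Pt with SOB; CXR shows left lower lobe opacity. History of MI."
    for t in tag(note):
        print(f"{t.text!r} -> {t.category}")
```

A real system of this kind would need far more patterns, handling of negation and context, and evaluation against a gold standard, none of which this sketch attempts.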
Similar resources
Maternal Medical Information Extraction (MaMIE) System
This paper discusses the development of an information extraction system for maternal health records. The maternal health records currently kept in hospitals are written as long, detailed paragraphs in text documents. Information extraction processes unstructured, natural language text, such as maternal health records, to extract useful information from the text. MaMIE (Matern...
Automatically Recognizing Medication and Adverse Event Information From Food and Drug Administration’s Adverse Event Reporting System Narratives
BACKGROUND The Food and Drug Administration's (FDA) Adverse Event Reporting System (FAERS) is a repository of spontaneously-reported adverse drug events (ADEs) for FDA-approved prescription drugs. FAERS reports include both structured reports and unstructured narratives. The narratives often include essential information for evaluation of the severity, causality, and description of ADEs that ar...
Towards comprehensive syntactic and semantic annotations of the clinical narrative
OBJECTIVE To create annotated clinical narratives with layers of syntactic and semantic labels to facilitate advances in clinical natural language processing (NLP). To develop NLP algorithms and open source components. METHODS Manual annotation of a clinical narrative corpus of 127 606 tokens following the Treebank schema for syntactic information, PropBank schema for predicate-argument struc...
Annotation of Clinical Narratives in Bulgarian language
In this paper we describe the process of annotating clinical texts with morphosyntactic and semantic information. The corpus contains 1,300 discharge letters in Bulgarian for patients with endocrinology and metabolic disorders. The annotated corpus will be used as a gold standard for evaluating information extraction on a test corpus of 6,200 discharge letters. The annotation is performed wi...
An Efficient Inductive Unsupervised Semantic Tagger
We report our development of a simple but fast and efficient inductive unsupervised semantic tagger for Chinese words. A hand-tagged POS corpus of 348,000 words is used. The corpus is tagged in two steps. First, possible semantic tags are selected from a semantic dictionary (Tong Yi Ci Ci Lin), the POS, and the conditional probability of a semantic tag given the POS, i.e., P(S|P). The final semantic t...
Publication date: 2013